Split hash-based groupby into multiple smaller files to reduce build time #17089

Merged on Oct 19, 2024 (26 commits)


PointKernel
Member

Description

This work is part of splitting the original bulk shared memory groupby PR #16619.

This PR splits the hash-based groupby file into multiple translation units and uses explicit template instantiations to help reduce build time. It also includes some minor cleanups without significant functional changes.
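
As a rough illustration of the technique described above (not the actual libcudf code), the following sketch shows how a function template can be declared in a header, marked `extern template` so that includers do not instantiate it, and then explicitly instantiated once per type in separate source files. The function name `sum_values` and the file names in the comments are hypothetical; everything is shown in one listing here only so that it compiles on its own.

```cpp
// Illustrative sketch only: sum_values and the file layout named in the
// comments are hypothetical and are not taken from libcudf.
#include <numeric>
#include <vector>

// ---- sum_values.hpp (hypothetical public header) ----
// Declaration only: includers never see the body, so they cannot
// implicitly instantiate the template themselves.
template <typename T>
T sum_values(std::vector<T> const& values);

// Explicit instantiation declarations: each includer is told that these
// specializations are instantiated in some other translation unit.
extern template int sum_values<int>(std::vector<int> const&);
extern template double sum_values<double>(std::vector<double> const&);

// ---- sum_values_impl.hpp (hypothetical, included only by the files below) ----
template <typename T>
T sum_values(std::vector<T> const& values)
{
  return std::accumulate(values.begin(), values.end(), T{});
}

// ---- sum_values_int.cpp (hypothetical) ----
// Explicit instantiation definition: the int version is compiled here, once.
template int sum_values<int>(std::vector<int> const&);

// ---- sum_values_double.cpp (hypothetical) ----
// Explicit instantiation definition: the double version is compiled here, once.
template double sum_values<double>(std::vector<double> const&);
```

With a layout like this, a caller includes only the header and writes, for example, `sum_values<int>({1, 2, 3})`. Each instantiation lives in its own small source file, so the build system can compile them in parallel instead of processing one large file.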

Checklist

  • I am familiar with the Contributing Guidelines.
  • New or existing tests cover these changes.
  • The documentation is up to date with these changes.

@PointKernel added the 3 - Ready for Review, libcudf, CMake, improvement, and non-breaking labels on Oct 15, 2024
@PointKernel self-assigned this on Oct 15, 2024
@PointKernel requested review from a team as code owners on October 15, 2024 18:06
@PointKernel
Member Author

I'm open to further splitting this PR, but most of the changes are simply moving functions and classes into their own translation units and adding explicit instantiations.

@KyleFromNVIDIA (Contributor) left a comment

Approved trivial CMake changes

@PointKernel requested a review from ttnghia on October 17, 2024 19:59
@mhaseeb123 added the cuco label on Oct 17, 2024
@mhaseeb123 (Member) left a comment

A couple comments. Looks good otherwise.

@davidwendt
Contributor

The build metrics report for this PR so far shows that all the groupby/hash/ source files now compile in less than a minute, while the original groupby/hash/groupby.cu took 7 minutes to compile (reference link).

@PointKernel
Member Author

/merge

@rapids-bot merged commit 074ab74 into rapidsai:branch-24.12 on Oct 19, 2024
102 checks passed
@PointKernel deleted the dispatch-groupby branch on October 19, 2024 21:34
Labels
3 - Ready for Review, CMake, cuco, improvement, libcudf, non-breaking

6 participants